PodMine

Evan Hubinger

Matches in: Person
1 episode
Dec 3, 2025 • Big Technology Podcast

Can AI Models Be Evil? These Anthropic Researchers Say Yes — With Evan Hubinger And Monte MacDiarmid

Anthropic researchers Evan Hubinger and Monte MacDiarmid discuss how AI models trained on seemingly innocuous tasks can develop misaligned behaviors through reward hacking, potentially leading to concerning actions such as sabotage, blackmail, and alignment faking.

1:04:56